All Questions
8 questions
1 vote
0 answers
478 views
How to (simply) architect a way to ingest multiple types of large files, process them, and send data in chunks to web services?
Note: All of this would be in AWS. Hi everyone, what would you guys suggest for building something that: Takes in several different input file types (e.g. csv, json, jsonl, xml, .gz, ...) That can be ...
2 votes
2 answers
3k views
From Oracle to Apache Parquet : how to handle eventual consistency?
I have an existing production Oracle database. However, there are performance issues for certain kinds of operations, because of the volume of the data or the complexity of the queries. That's why I ...
1 vote
1 answer
166 views
BigData: time-based word count
What we are trying to do: We're trying to build a system that will count the number of unique entries for a certain timeframe. It works fine until the data grows or the timeframe increases, then ...
5 votes
1 answer
231 views
Querying large amount of data for parallel processing [closed]
I have a dataset containing a list of users (around 50M). Each user has an email address, name, and some more data columns. I want to send a weekly email to those users, and the content of the email ...
1 vote
0 answers
107 views
Dealing with big data [closed]
I am on a project dealing with a lot of data in the form of images and videos (data related to wind engineering). My requirement is to build a predictive algorithm based on the data I have. I have ...
4 votes
4 answers
296 views
Approaches for storing and analysing large amounts of time-based data
I've been asked to develop a "telemetry" application that records data generated by a hardware device, which I read every 100ms. There are approx 250 data points (32-bit values), but only a subset ...
3 votes
3 answers
364 views
Big Data: Can it be pre-processed?
My question is about "big data". Basically, big data involves the analysis of a large amount of data to derive meaningful insights from it. I would like to know: Whether or not large amounts of data ...
0 votes
1 answer
1k views
Developing an analytics system that processes large amounts of data - where to start
Imagine you're writing some sort of web analytics system: you're recording raw page hits along with some extra things like tagging cookies, etc., and then producing stats such as: Which pages got most ...